Abstract Decision support systems enable users to quickly assess data, but they 
require significant resources to develop and are often relevant to limited domains. 
This chapter identifies the implicit assumptions that require contextual analysis for 
decision support systems to be effective for providing a relevant threat assessment. 
The impacts of the design and user assumptions are related to intelligence errors 
and intelligence failures that come from a misrepresentation of context. The intent 
of this chapter is twofold. The first is to enable system users to characterize trust 
using the decision support system by establishing the context of the decision. The 
second is to show technology designers how their design decisions impact system 
integration and usability. We organize the contextual information for threat analysis 
by categorizing six assumptions: (1) specific problem, (2) acquirable data, (3) use of 
context, (4) reproducible analysis, (5) actionable intelligence, and (6) quantifiable 
decision making. The chapter concludes with a quantitative example of context 
assessment for threat analysis. 


Keywords High-level information fusion · Situation assessment · Threat assessment · 
Context · Timeliness · Uncertainty · Unknowns 


5.1 Introduction 


A threat is an assessment that an individual or group has the potential to cause harm 
to a specific entity or entities. Threat assessment has three parameters: intent, 
capacity, and knowledge, or alternatively intent, capability, and opportunity [1]. During the Cold 
War, sovereign nations engaged other sovereign nations using military-specific 
vehicles operating in collaborative groups. The battle groups were centrally 
coordinated and positioned away from civilian activities to maximize their 
maneuverability [2]. 


S.A. Israel 
Raytheon, Chantilly, VA, USA 
e-mail: Steven.a.Israel@Raytheon.com 


E. Blasch (✉) 
Air Force Research Lab, Rome, NY, USA 
e-mail: erik.blasch@gmail.com 


© Springer International Publishing Switzerland (outside the USA) 2016 
L. Snidaro et al. (eds.), Context-Enhanced Information Fusion, 
Advances in Computer Vision and Pattern Recognition, 
DOI 10.1007/978-3-319-28971-7_5 


Fig. 5.1 Image quality parameters versus tasks: courtesy of David Cannon [5] 
(Labels in the figure: Response, Motion, GSD, rate, FOV, Time, Complexity, Tasking, 
Situational awareness, Monitor facilities, Track vehicles, Track humans, ID, track 
and prosecute) 

The Cold War military performed directed data collection, which means that 
they maintained custody of the information throughout the stovepiped exploitation 
chain. Enemy intent and capacity were based upon knowledge of the leaders, 
military strength and readiness, and doctrine. For example, the Cold War threats 
were so well understood that the required information and analyst tasks determined 
the design for imaging sensors [3, 4]. Figure 5.1 identifies that design relationship 
including ground sampling distance (GSD) and field of view (FOV) for the National 
Imagery Interpretability Rating Scale (NIIRS). 

In addition to the traditional Cold War threats, threats to sovereign nations also 
include: organized crime, narcotics trafficking, terrorism, information warfare, and 
weapons of mass destruction (WMD) [6]. Non-national actors pose different 
threats in the following manner: (1) there is no identifiable battlefront; 
(2) non-national actors keep and garrison few if any pieces of heavy military 
hardware, rocket launchers, tanks, etc., which both reduces their physical signature 
and minimizes their liabilities; (3) they maintain no persistent doctrine; (4) their 
numbers and actions form only a small fraction of a percentage of the resident 
population; and (5) they dictate attacks in the political, financial, cyber, and cultural 
domains in addition to the geospatial, when their opportunity for success is greatest 
[7-9]. 

One example of a terrorist event is the bombing during the 2013 Boston 
Marathon. The bombers' intent was to destabilize public trust. Their capacity 
was a small amount of funds and two individuals. Their technical knowledge was 
of home-made explosives, and their operational knowledge was of the crowd 
movement during the marathon, which they used to maximize their impact. 

The remainder of this chapter is laid out in the following manner. Threats to 
sovereign nations are defined. The common elements of those threats and their 
impacts on decision support systems are identified. The assumptions used by 
decision support system developers are made explicit. Finally, an example of how 
the developer assumptions can be quantified using evidence theory is presented. 


5.2 Defining Threats 


5.2.1 Threat Assessment 


To identify the threat’s intent, capacity, and knowledge, analysts seek information 
from four basic knowledge types (Table 5.1): entity knowledge provides the static 
who or what, where, and when information; the activity or transaction knowledge 
provides dynamic components for how; association knowledge provides with whom 
and link method information; and finally context knowledge provides why 
information. Using these information types, the analyst seeks to answer the following: 


• Is the threat credible? 
• Who are the individuals or groups composing the threat? 
• What is the impact and likelihood of threat against individuals, entities, and 
  locations? 
• How has the threat evolved since the previous assessment? 


Table 5.1 Diversity of knowledge types 

Information level: Entity 
  Description: Static target, noun: person, car, building, website, idea 
  Example questions: Determine type of target, location, and time: where, what, 
    and when? 
  Metadata: Name, work, ownership, membership, address, area extent, topic, and 
    content 

Information level: Activity/Event 
  Description: Entity performing action 
  Example questions: Tracking entity, routes, estimating traffic patterns, 
    transactions, volume, changes: where's it going, is it moving with the rest 
    of traffic, how many file downloads? 
  Metadata: Traffic volume, direction, diversity, mode, domain type (financial, 
    physical, social media), coordinated activities, criminal acts, and daily 
    commute 

Information level: Association 
  Description: Functional relationship among entities 
  Example questions: Network, membership, purpose: who are the friends of the 
    entity, what is the purpose for their association? 
  Metadata: Interpersonal (family, friends, employer), social interactions 
    (people, places), topic, purpose, accessibility, cost, and transaction type 

Information level: Context 
  Description: Conditions under which entity interacts within its environment 
  Example questions: Determine activity/event/transaction purpose along with 
    tactics, techniques, and procedures: why? 
  Metadata: Culture, geography, cost, politics, subject, history, religion, 
    social interaction, availability, and access 

Fig. 5.2 Knowledge types for evidence in the human environment 


Information from one knowledge type can be used to cue another (Fig. 5.2). 
Evidence is data or rules about individuals or other entities, activities/transactions, 
associations, and context used to characterize a threat. Evidence accumulation is 
conceptualized as building a legal case rather than the Cold War target prosecution 
[10]. Evidence can be either direct or circumstantial. Direct evidence links a 
signature (entity, activity, association) to known actor(s) or entities; i.e., labeled 
data. Circumstantial evidence requires an inference to link information to an entity. 

Activity and entity information can be nested to describe transactions and events. 
Transactions are linked activities, where information or materials are passed. Events 
are related activities occurring over a given domain and time [11]. 

Information from the four knowledge types is now being exploited by 
corporations and private citizens. Intelligence can be sold to advertisers or used 
for bootstrapping other types of attacks, business espionage, and the generation of 
high-quality predictions of future activities [12]. The majority of these data are 
provided willingly and unconsciously by the public [13]. 


5.2.2 Threat Assessments Should Have Unique System 
Requirements 


Intelligence questions can be broken into three basic categories: assessment, 
discovery, and prediction [14]. Though the focus of this chapter is threat assessment, 
many of the concepts are applicable to discovery and prediction. To perform threat 
assessment, evidence accumulation must be structured to track activities of 
individuals independent of collection mechanism [15]. Individuals may be cooperative, 
such as members of online social networks who provide a wide range of personal 
information; noncooperative individuals limit their public footprint; and 
uncooperative individuals actively seek to defeat attempts to collect their 
signature. 

Jonas [16] suggested the following traits that a decision support system should 
possess. 

• Sequence neutral processing: knowledge is extracted as it becomes available 
  and assessed as evidence immediately. Note: the system must be cognizant that 
  data may arrive out of order from when it was collected. 

  – The decision and confidence may change with time as additional confirming 
    and rejecting evidence are reported. 

• Raw data must be processed only once [17], because access, 
  collection-evaluation, and transmission of data generate a tremendous 
  computational, storage, and network burden due to the 5V (volume, velocity, 
  veracity, variety, and value) issues. 

• Relationship aware: links among individuals to either known or discovered 
  individuals become part of the entity metadata. 

• Extensible: the system must be able to accept new data sources and attributes. 

• Knowledge-based thesaurus: support functions exist to handle noise when 
  comparing queries to databases. 

  – Cultural issues such as transliteration of names or moving from the formal to 
    the informal. 

  – Imprecision such as a georeference being a relative position rather than an 
    absolute location; i.e., over there versus a specific latitude and longitude 
    [18]. 

  – Text, rhetoric, and grammar change often and the change rate is even faster 
    in social media than more formal communications such as broadcast news. 

• Real-time: changes must be processed on the fly with decisions happening in an 
  actionable timeline; i.e., online learning. 

  – Perpetual analytics: no latency in alert generation. 

• Scalable: able to expand based upon number of records, users, or sources. 
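
The sequence-neutral and process-once traits in the list above can be illustrated with a minimal sketch. The class names, field names, and the signed `weight` used as a stand-in for evidence support are assumptions made for illustration only; they are not taken from Jonas [16] or from any particular system.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence; every field name here is an illustrative assumption."""
    record_id: str          # unique key used to guarantee process-once behavior
    entity: str             # who or what the evidence is about
    collected_at: datetime  # when the observation was made, not when it arrived
    weight: float           # signed support: > 0 confirms, < 0 rejects


@dataclass
class EvidenceLedger:
    """Assesses evidence on arrival, regardless of the order of collection."""
    seen: set = field(default_factory=set)
    timeline: dict = field(default_factory=dict)  # entity -> list of Evidence
    score: dict = field(default_factory=dict)     # entity -> running support

    def ingest(self, ev: Evidence) -> bool:
        """Assess a record immediately; return False if it was already processed."""
        if ev.record_id in self.seen:
            return False  # raw data are processed only once
        self.seen.add(ev.record_id)
        events = self.timeline.setdefault(ev.entity, [])
        events.append(ev)
        # Keep the history ordered by collection time, not by arrival time.
        events.sort(key=lambda e: e.collected_at)
        self.score[ev.entity] = self.score.get(ev.entity, 0.0) + ev.weight
        return True
```

Because the running score is updated on every accepted record, the decision and its confidence can change as confirming or rejecting evidence arrives, while the per-entity timeline stays ordered by collection time even when records arrive late.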


5.3 Assumptions for Decision Support Systems 


The remainder of this chapter describes the assumptions for threat assessment 
decision support systems. Figure 5.3 is an engineering functional block diagram for a 
generic information exploitation system. For a given problem statement, there are 
assumptions included in the threat assessment. These assumptions are organized 
into the Data Fusion Information Group (DFIG) model levels (L1 ... L5) of 
information fusion. Together, the assumptions along the processing chain are 
included in the generated information that accompanies a threat decision. However, 
there must be a reality vector that translates the decision into the required 
information. The ops tempo determines the amount of context that can be accurately 
relayed in the assumptions that accompany a decision. At each functional block, the 
common assumptions made by users or technology developers are made explicit 
[19]. Within each section, the assumption is further resolved. 

Fig. 5.3 Assumptions within the human environment 

Each assumption in Fig. 5.3 contributes to intelligence errors and intelligence 
failures [20]. Intelligence failure is the systemic organizational surprise resulting 
from incorrect, missing, discarded, or inadequate hypotheses. Intelligence errors are 
factual inaccuracies in analysis resulting from poor or missing data. Though this 
chapter focuses on threats to governments [21], the concepts are applicable for 
understanding threats within social networks [22], by criminals [23], and to 
financial systems [24]. 


Assumption 1 The Problem is Specific 


Assumption 1: The Problem Statement is Specific 

The assumption that the problem statement is specific presumes that the decision 
support system's output relates to the problem statement [25], which is noted in 
Fig. 5.3 as the reality vector. The problem statement assumption asks fundamental 
questions: Can the threat be described as a question or hypothesis? Is the decision 
relevant to the question? 


Assumption 1.1 Can the Threat be described as a Question or a Hypothesis? 
The first part is to understand the type of question being asked. Asking the right 
question relates directly to context. For current insurgent warfare [2], nations face 


5 Context Assumptions for Threat Assessment Systems 


threats from a number of groups each with different outcome intent, capacity, and 
knowledge as shown in Fig. 5.4. This uncertainty in the enemy probably led Donald 
Rumsfeld [26] to state the following: 


• There are known knowns; there are things we know that we know. 
• There are known unknowns; that is to say there are things that we now know 
  we don't know. 
• But there are also unknown unknowns – there are things we do not know we 
  don't know. 


Treverton [27] described this taxonomy as puzzles, mysteries, and complexities. 
Figure 5.4 highlights the ability to translate unknowns into knowns. The first case, 
and the one most familiar to information fusion, is a data-driven approach in which 
the perceived unknowns are mapped to perceived knowns (whether reality has been 
satisfied). For example, collections can verify that the perceived unknown is still 
unknown. The second case is a knowledge-driven approach in which the unknown 
reality is moved to a known reality. To make things known, context-driven 
approaches match the unknown perceived unknowns into reality through evidence 
analysis. 

The next part of the question is to understand blindspots. Originally, analysts 
assumed that threat networks consisted of a central hierarchical authority. Analysts 
would then look for evidence of a kingpin and assess their capacity to do harm, 
which is similar to the Federal Bureau of Investigation (FBI) combating organized 
crime in the 1950s and 1960s [23, 28]. Although this paradigm might have been 
prevalent prior to the 9/11 attacks [29], Al Qaeda and its confederates moved away 
from that model shortly afterward [2]. Current threat networks are transient based 
upon opportunity and mutual interests [30]. 

There is no clear solution for how to ask the right question, nor any guarantee 
that having the right information leads to success. For example, given a chess board 
arranged in the normal starting position, no single opening move exists for the white 
player that guarantees a win even though (s)he has perfect situational awareness. 
The strategy in chess is to play the game until a small number of alternatives exist 
before taking finishing action. The same strategy is essential for assessing and 
countering threats (Fig. 5.5) [31]. 


Assumption 1.2 Is the Decision a Relevant Question? 

Analytical workflows commonly focus on specific data modalities and exploitation 
techniques. The reliance on existing processing chains has a number of causes. The 
first cause is mechanical; sensor data have known workflows. Their output products 
have known and quantifiable performance metrics. The second cause is 
organizational inertia; adopting new business processes takes strong leadership for 
change and involves risk. The third cause is the lack of resources [32]: the number 
and skill sets of analysts are concentrated in a relatively small cadre [33]. The fourth 
cause is that changing any element in the exploitation chain requires training and a 
learning timeline, which is a large investment of time and money and most likely 
brings a near-term reduction in performance. The fifth cause is that though a new or 
different knowledge source may contain sufficient information content, its 
technological readiness could be insufficient for operational usage. 


To test the problem statement, all evidence must be structured to either confirm it 
or reject it. Therefore, individuals who generate problem statements must also 
understand the structure of the output. The downstream cost is the burden of 
transforming the data prior to analysis. 

Currently, evidence accumulation is a manual, cognitive process. However, 
analysts spend more of their time locating data sources than assessing information. 
Government and industry have problems federating disparate data repositories and 
resolving entities across those systems. Other issues facing the analysts are that 
their customer bases and product diversity are increasing. Another unfortunate 
circumstance for the current generation of analysts is that the timelines have 
shortened and they rarely have the time to perform their after action reviews 
(AARs) to assess system performance and usability. 

Johnston [20] produced a series of tools and techniques to address the issues 
stated by Rumsfeld, which include questioning the foundation assumptions, looking 
for precursor actions, alternative analysis, etc. One example is black-hatting friendly 
capabilities, where a black hat is a hacker who violates computer security for little 
reason beyond mischief or personal satisfaction. Other researchers are rediscovering 
that the critical actors that enhance threat capacity are those individuals and entities 
with unique skills and capabilities that arrive just in time, i.e., the strength of weak 
ties [34]. 


Assumption 2 Context Data can be Acquired 


Assumption 2: Context Data can be Acquired to Fill Knowledge Gaps 
The assumption that data can be acquired to fill knowledge gaps is a holdover 
from the directed collections of the Cold War. The purpose for data collection 
is to improve decision confidence above some threshold. Many data streams 
are continually generating information, so the context is dynamic. Data 
collection is therefore less important than continually trawling known databases 
for new content or determining the location of relevant data sources. The data 
acquisition assumption contains several further assumptions: data collection is 
unbiased, target signatures are constant, data quality can be determined, and all 
the information is collected [35]. 


Assumption 2.1 Data Collection is Unbiased 

Nondirected data sources have diverse origins and their chain of custody is 
incomplete. The provenance links may also contain a level of uncertainty, which 
reduces the trustworthiness of the source [36, 37]. Although the total amount of data 
is large, the amount of data available as evidence may be sparse for a specific 
problem set, location, or entity. 
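
One hedged way to make the effect of provenance uncertainty concrete is to treat each link in the chain of custody as having a confidence in [0, 1] and to discount the source accordingly. The sketch below is an illustration under stated assumptions, not the chapter's method: it assumes the links are independent, so their confidences multiply, and it applies Shafer-style discounting from evidence theory, which moves the discounted belief to an "unknown" (ignorance) hypothesis. The function names and the hypothesis labels are hypothetical.

```python
import math


def chain_reliability(link_confidences):
    """Reliability of a source as the product of its provenance-link confidences.

    Assumes the links are independent and each confidence lies in [0, 1];
    a missing link can be modeled with a low confidence.
    """
    if any(not 0.0 <= c <= 1.0 for c in link_confidences):
        raise ValueError("confidences must lie in [0, 1]")
    return math.prod(link_confidences)


def discount(masses, reliability, frame="unknown"):
    """Shafer-style discounting of a mass assignment.

    Every hypothesis other than `frame` keeps only `reliability` times its
    mass; the remainder is moved to `frame`, which represents ignorance.
    """
    out = {h: reliability * m for h, m in masses.items() if h != frame}
    out[frame] = 1.0 - sum(out.values())
    return out
```

For example, two custody links with confidences 0.9 and 0.8 give a reliability of about 0.72, and discounting a report with masses of 0.7 (threat), 0.2 (no threat), and 0.1 (unknown) by a reliability of 0.5 leaves 0.35, 0.1, and 0.55, respectively.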


Assumption 2.2 Target Signatures are Constant 

Target signatures are the information types (entity, activity, association, or context) 
that describe an individual within a domain (geospatial, financial, cyber, etc.). The 
assumption has two basic components. First, an individual’s or entity’s interactions 
with their environment are invariant over time and space. Second, observed activity 
has a known and constant meaning. Interpreting activities is difficult because they 
vary with: 


• External stressors: a stressor, such as the arrest of a threat network member, 
  will cause a change in the tactics, techniques, and procedures (TTPs) of the 
  group, à la Maslow's hierarchy. Yet, the network itself may remain intact [38]. 

• Not all threat activities are anomalies, and not all anomalies are threats. 

• Cultural difference within a population: Eagle [39] showed that the 
  individual's use of communication is a function of their anthropological 
  attributes as well as network strength and stability. 

• Type and size of network: Members of a threat network are also members of 
  the general population [40]. The majority of the threat individual's actions are 
  benign. Therefore, even knowing that an individual is part of a threat network, 
  determining which of their actions contributes to a threat is difficult. 

• Anonymity: Threat actors in the cyber domain may usurp an authorized user's 
  identity [41]. Identity theft is commonplace in financial transactions even with 
  tokens and passwords, i.e., credit cards and online banking [42]. 


Sakharova [24] documented the change in Al Qaeda’s financial transactions 
since 9/11. Originally, the group was highly centralized using commercial banking 
institutions, money laundering techniques, and countries with lax laws and poor 
banking oversight. As Western countries cracked down on their legitimate banking 
operations, the group changed tactics to holding and transferring money in fixed 
commodities such as gold. Alternatively, these groups used the more traditional 
Islamic money transfer method of Hawala, which is comparable to Western Union 
transfers using trusted, usually unaffiliated, individuals without formal 
record-keeping. 

To mitigate the effect of changing target signatures, analysts attempt to identify 
individuals across all domains in which they operate. The tracking process is called 
certainty of presence. Certainty of presence has the added benefit of revealing when 
a signature for a particular entity is no longer valid in a given domain. Though 
membership within modern threat networks is based on mutual gains, individuals 
generally interact with those whom they trust and with whom they have deep ties 
[43, 44]. 


Assumption 2.3 Data Quality is Measurable 

Data quality deals with the accuracy and precision of each data source [45]. For 
many directed sensors, the inherent data quality can be computed by convolving 
target, sensor, and environmental parameters [46] (Fig. 5.6). However, nondirected 
and nonsensor data have aspects of human interactions that include missing 
attributes, incorrect or vague inputs, and even ill-defined attribute classes. Incorrect 
or incomplete data could be due to human input errors, such as day/month/year 
variations or even leading zeros. Depending upon the context, incorrect information 
could be an indicator of hostile activity; i.e., deliberate malfeasance. 
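
The day/month/year problem can be made concrete with a small sketch that flags date strings whose meaning depends on the input convention. The two candidate formats and the function names are assumptions chosen for illustration; a fielded system would need the conventions actually used by its sources.

```python
from datetime import datetime


def date_interpretations(text: str) -> set:
    """Return every calendar date that a numeric slash-separated string could mean."""
    formats = ("%d/%m/%Y", "%m/%d/%Y")  # assumed candidate input conventions
    found = set()
    for fmt in formats:
        try:
            found.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # the string is not a valid date under this convention
    return found


def is_ambiguous(text: str) -> bool:
    """True when the string maps to more than one distinct date."""
    return len(date_interpretations(text)) > 1
```

Under these assumptions, "03/04/2015" is ambiguous (3 April or 4 March), "25/12/2015" is not, and "31/02/2015" has no valid interpretation at all, which may itself be worth flagging as an input error or as a possible indicator of deliberate manipulation. Leading zeros do not change the result, since "3/4/2015" parses to the same dates as "03/04/2015".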


Human interactions make digital data, cyber in particular, suspect as evidence 
because: (1) altering digital records is easy and the chain of custody is difficult to 
confirm; (2) forensic data review may not yield information about file 
manipulation; (3) standards are lacking for the collection, verification, exploitation, 
and preservation of digital evidence; (4) the 5Vs make the organization, scanning, 
and sifting functions by investigators difficult for determining the responsible party 
for the digital attack; and (5) assigning the information to a unique individual is 
difficult to prove [21]. 

Fig. 5.6 Operating quality conditions affecting data 


Assumption 2.4 All Knowledge is Collected 

This assumption presumes that analysts have access to all the directed and 
nondirected data collections and that those data contain all threat information. In 
reality, however, users only know the volume of data they can access and are most 
likely unable to estimate the amount of missing information. The assumption is that 
the available information can fully describe the threat. The cost of false alarms can 
be computed and related to intelligence errors. However, the cost of missing 
evidence cannot be computed and is most likely to lead to surprise, i.e., intelligence 
failures. 


Assumption 3 Context Data can be Fused 


Assumption 3: Data can be Fused 

The fundamental goal for data fusion is to develop a discrete decision on a 
threat assessment. Fusing disparate data can add error as to whether the 
observations relate to a common entity, activity, or association [47]. As the 
amount of evidence increases, these uncertainties are expected to resolve. 
Two fundamental assumptions are associated with data fusion: the data fusion 
strategy is fixed, and knowledge can be abstracted to different resolutions. 
Both require context (or, for that matter, the right context) to change fusion 
strategies to produce the correct fidelity. 


Assumption 3.1 The Data Fusion Strategy is Fixed 

This discussion parallels the relevance of the decision process from Assumption 1. 
Since the combination of intent, capacity, and knowledge is unique for each threat, 
there is no expectation that a specific data type can be collected [48-50]. 
Information fusion is the interaction of sensor, user, and mission [51] for situation 
and threat assessment [52]. Challenges for information fusion [53] include the 
design of systems to identify and semantically classify threats, treating information 
exploitation as information management [54]. The integration should be based upon 
the constraints of the data streams (Fig. 5.7). Many constraints exist for data-level 
integration, which requires the individual sources to be aligned in space and time 
and is classically called data fusion. Usually, only image data are layered in this 
fashion. More commonly, attribute/feature integration is performed, where the data 
are only constrained by time or space. However, for threat information there must be 
relevant features that come from threat concepts for a given threat event 
identification. 


Data fusion errors include the duplication of information across fields, fields 
incorrectly populated, and extensive use of unstructured data. Time stamps 
contribute to misregistration by either poor definition of the clock or incorrect values. 
To mitigate these issues, background processes are required to test for duplication 
and trustworthiness, which is often described as metadata triage. Information triage 
assesses the individual data streams for information content. 
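
A background triage pass of the kind described above might look like the following minimal sketch. The record layout (a dictionary with `source`, `entity`, and `timestamp` keys), the three checks, and the reason strings are assumptions made for illustration; they stand in for whatever duplication and trustworthiness tests a real system would apply.

```python
from datetime import datetime, timezone


def triage(records, now=None):
    """Split records into accepted and flagged lists using simple metadata checks.

    Each record is assumed to be a dict with 'source', 'entity', and
    'timestamp' (a datetime) keys. Flagged items are (record, reason) pairs.
    """
    now = now or datetime.now(timezone.utc)
    seen = set()
    accepted, flagged = [], []
    for rec in records:
        ts = rec["timestamp"]
        key = (rec["source"], rec["entity"], ts)
        if key in seen:
            flagged.append((rec, "duplicate"))
        elif ts.tzinfo is None:
            # A timestamp without a time zone has a poorly defined clock.
            flagged.append((rec, "undefined clock"))
        elif ts > now:
            # A timestamp later than the present is an incorrect value.
            flagged.append((rec, "future timestamp"))
        else:
            accepted.append(rec)
        seen.add(key)
    return accepted, flagged
```

Flagged records are kept with the reason for the flag rather than being discarded, so that an analyst can later judge whether a bad value is an input error or something more deliberate.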


Assumption 3.2 Knowledge can be Abstracted from Other Resolutions 

This assumption states that data of differing resolutions can be combined without a 
loss of information content. Anomaly detection is often performed by observing 
deviations from the norm [55]. If data are generalized to a coarser resolution, then 
the observed differences between an anomaly and the normal will be smoothed, 
possibly below a detection threshold. If the data are assigned to rates higher than 
those at which they were collected, uncertainty creeps into the relationship among 
entities, activities, or events. 
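
The smoothing effect can be shown with a small numerical sketch. The hourly counts, the baseline, and the threshold below are invented for illustration, and the deviation-from-baseline test is a deliberately simple stand-in for an anomaly detector.

```python
def coarsen(values, factor):
    """Average consecutive blocks of `factor` samples to a coarser resolution."""
    if len(values) % factor:
        raise ValueError("length must be a multiple of factor")
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values), factor)]


def anomalies(values, baseline, threshold):
    """Indices whose absolute deviation from the baseline exceeds the threshold."""
    return [i for i, v in enumerate(values) if abs(v - baseline) > threshold]


# Hypothetical activity counts: 24 hourly samples with one spike at hour 13.
hourly = [10.0] * 24
hourly[13] = 40.0
daily = coarsen(hourly, 24)        # a single daily mean of 11.25

BASELINE, THRESHOLD = 10.0, 5.0    # assumed normal level and detection threshold
hourly_hits = anomalies(hourly, BASELINE, THRESHOLD)   # [13]
daily_hits = anomalies(daily, BASELINE, THRESHOLD)     # []
```

At the hourly resolution the spike deviates from the baseline by 30 and is detected; after averaging to a daily value the deviation is only 1.25, which falls below the threshold. The reverse operation has its own cost: expanding the single daily value back to 24 hourly values cannot recover which hour held the spike.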


Assumption 4 Context Decisions are Reproducible 


Assumption 4: Context Decisions are Reproducible 
The assumption that decisions are reproducible presumes that the decision-making 
process is robust and auditable [56]. The assumptions built into the earlier 
functional blocks are expressed during decision making. Each piece of evidence's 
impact on the decision is assessed as it arrives. At decision time, the decision 
confidence is quantified. The assumptions made about the decision process are: 
threat assessment is pattern recognition, the operational context is understood, and 
human decision making is a good model for a computational engine. 


Assumption 4.1 Threat Assessment is Pattern Recognition 
The conventional pattern recognition paradigm contains assumptions that are 
violated by evidence accumulation [57]. 


• Threats fall within specific classes that are known a priori, exclusive, and 
  exhaustive 
• Data are not perishable 
• Knowledge classes are generated offline 
• Target signature variation is fully understood 
• Performance degrades predictably with signal aberrations 


The reality is that evidence accumulation for threat assessment does not adhere 
to any of the above assumptions, because no two threats are the same. Human 
activities are not independent, but interactive. Therefore, supervised classifiers that 
map input attributes to output classes are not relevant. 

The current threat assessment philosophy is to use anomaly detection. Anomaly 
detection requires a mechanism to continually sample the environment and measure 
normal conditions. Currently researchers use graph theory to map individuals 
within threat networks, and then infer the impact and likelihood [58]. The cost is 
that graph analysis is not computationally scalable. 

Machine decisions require the system to determine both an upper and lower 
evidence threshold, which can be conceptualized as a hypothesis test. The upper 
threshold is to accept the threat hypothesis and alert the user to take action. The 
lower threshold is to reject the hypothesis and alert the user that no threat 
exists. Irvine and Israel [59] used Wald's [60] sequential evidence approach to 
provide evidence bases using this strategy. 
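
The two-threshold decision can be sketched with Wald's sequential probability ratio test. This is a minimal illustration of the general technique, not necessarily the formulation used in [59]. It assumes each piece of evidence is summarized as a log-likelihood ratio, log(P(evidence | threat) / P(evidence | no threat)), that the pieces are independent, and that alpha and beta are the tolerated false alarm and miss probabilities; the thresholds are Wald's standard approximations.

```python
import math


def sprt(log_likelihood_ratios, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test over a stream of evidence.

    Returns (decision, number_of_items_used), where decision is one of
    'threat', 'no threat', or 'undecided'.
    """
    upper = math.log((1 - beta) / alpha)   # accept the threat hypothesis
    lower = math.log(beta / (1 - alpha))   # reject the threat hypothesis
    total = 0.0
    used = 0
    for used, llr in enumerate(log_likelihood_ratios, start=1):
        total += llr                       # accumulate evidence as it arrives
        if total >= upper:
            return "threat", used
        if total <= lower:
            return "no threat", used
    return "undecided", used               # evidence ran out between thresholds
```

With the default error rates the thresholds are about ±2.94, so three consecutive pieces of evidence that are each four times more likely under the threat hypothesis (log 4 ≈ 1.39 each) are enough to cross the upper threshold, while mixed evidence leaves the test undecided and waiting for more data.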


Assumption 4.2 Operational Context is Understood 

Context is fundamental to decision making [61]. Context is the environment for 
interpreting activities [62]. Prior to the Boston Marathon bombing, the bombers' 
activities were consistent with those of the crowd. Even if the authorities had been 
able to review the available imagery and social media of the bombers, they had no 
basis to interpret the bombers' activities as anomalies or threats. After the 
explosions, the context changed as the suspects began to flee Boston when their 
identities were discovered. 


Assumption 4.3 Human Decision Making is a Model for a Computational Decision Engine 

Humans perform evidence accumulation similar to the model in Fig. 5.5 [63] and 
have specific thresholds for recognition and understanding from which decisions 
are rendered, i.e., the Eureka moment [9, 32, 64-66]. Other uniquely human issues 
that also contribute to failure are: 


• Stereotyping based upon consistency, experience, training, or cultural and 
  organizational norms 

• Not rejecting hypotheses that no longer fit the situation; not questioning data 
  completeness 

• Evidence evaluation 

  – Greater faith placed in evidence that the analyst collected or experienced 
  – Absence of evidence = Evidence of absence 
  – Inability to incorporate levels of confidence into decision making 


Several research studies have refuted this assumption by relating decision 
performance to factors that include reduced timelines, criticality of decision, 
visibility of decision maker, experience, etc. [20, 67-69]. This class of problems is 
often called time-critical decision making. Time-critical decisions in humans are 
often characterized by the following: 


• Decreased emphasis on identifying and tracking alternatives 
• Exaggerated influence of negative data 
• Pieces of available evidence are often missed or not accounted for during the 
  decision process 
• Tendency toward automated decisions; faster than actually required 
• Mistakes tend to grow dramatically even for low-complexity situations 
• Increased time allocated to the wrong step in the decision process 


Analysts operating in a time-critical decision making environment will be 
affected by their personality towards risk; i.e., being risk-averse, risk-neutral, or 
risk-prone. Also, the decision maker's presence in the environment is a factor along 
with their ability to evaluate the situation. However, the research shows that decision 
making within a stressed environment can be improved through training. The 
training should contain four elements: increasing the individual's knowledge base, 
developing policies and procedures so the individual has a cognitive look-up table, 
performing tasks in simulated stressful environments, and providing cognitive tools 
for handling stress. The goal is to change the decision maker's process from cognitive 
to automatic [70]. 
Assumption 5 Context Decisions are Actionable


Assumption 5: Decisions are Actionable 
Actionable decisions require trust in the decision process, unambiguous
interpretation of the decision, and time to act. An actionable decision is no
guarantee of a correct or optimal decision.


Assumption 5.1 Decision Engines are Trusted 

Trust is a uniquely human concept. Cyber and financial systems have been using
trust to describe authentication. Measures exist for data quality [71]. However, for
computational decision engines, trust relates to human confidence in the results.
Trust can be developed by providing decision lineage, where lineage is the audit
trail for the decision’s entire processing chain. Threat assessment also looks for
agreement across disparate points of view (political, business, civil, secular, etc.).
No automated measure of this kind was found for this chapter.


User trust issues then are confidence (correct detection), security (impacts),
integrity (what you know), dependability (timely), reliability (accurate),
controllability, familiarity (practice and training), and consistency (reliable).


Assumption 5.2 Decisions are Rendered Unambiguously 

This assumption is the relationship between the rendered evidence and decision 
confidence. Cognitive interpretation of graphical information is a function of 
contrast among elements, graphical complexity, and human experience [72, 73]. 
Graph representations require simplifications to demonstrate relationships [74], 
which may mask other interactions [75, 76]. Ideally, rendered decisions will also
characterize how far the decision is from the closest alternative, its relationship to
the evidence threshold, and whether the context is correctly classified.
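
The first two items in that list can be sketched as a small record that accompanies each rendered decision. This is only an illustration under stated assumptions: the names `RenderedDecision` and `render_decision`, the score dictionary, and the threshold value are hypothetical and not part of any system described in this chapter. Whether the context is correctly classified is not covered by the sketch.

```python
from dataclasses import dataclass


@dataclass
class RenderedDecision:
    label: str                    # winning hypothesis
    score: float                  # confidence assigned to the winner
    closest_alternative: str      # runner-up hypothesis
    margin_to_alternative: float  # winner score minus runner-up score
    margin_to_threshold: float    # winner score minus evidence threshold
    meets_threshold: bool         # True only if the evidence threshold is met


def render_decision(scores, threshold):
    """Summarize a decision from hypothesis scores (hypothetical helper)."""
    if len(scores) < 2:
        raise ValueError("need at least two hypotheses to report an alternative")
    # Rank hypotheses from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    return RenderedDecision(
        label=best,
        score=best_score,
        closest_alternative=second,
        margin_to_alternative=best_score - second_score,
        margin_to_threshold=best_score - threshold,
        meets_threshold=best_score >= threshold,
    )


# Assumed scores and threshold, chosen only for illustration.
example = render_decision(
    {"threat": 0.55, "no threat": 0.40, "unknown": 0.05}, threshold=0.70
)
print(example)
```

In this assumed case the leading hypothesis is ahead of its closest alternative but still below the evidence threshold, which is the kind of ambiguity a rendering should expose rather than hide.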


Assumption 5.3 Decisions are Timely 

Under ideal conditions, computational decisions are rendered instantly. However, 
computational decisions have the same issues as humans with respect to finite 
timelines [77]. The concept is called time-sensitive computing (Fig. 5.8). Many 
computational applications fall into this realm of conditional performance profiles 
that allow meta-data to control processing time based upon time allocation or input 
quality [78]. So, the algorithms operate until either the performance threshold has
been met or the available time has been exhausted.
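
A minimal sketch of this stopping rule is given below, assuming an iterative algorithm whose result can be refined step by step. The function `run_anytime` and its parameters are hypothetical names introduced for illustration, and Newton iteration for the square root of 2 stands in for an arbitrary refinable computation; it is not the method of [78].

```python
import time


def run_anytime(refine, quality, state, quality_target, time_budget_s):
    """Refine `state` until the quality target or the time budget is reached.

    Returns the final state, the reason for stopping ("quality" or "time"),
    and the number of refinement steps performed.
    """
    deadline = time.monotonic() + time_budget_s
    iterations = 0
    while True:
        if quality(state) >= quality_target:
            return state, "quality", iterations
        if time.monotonic() >= deadline:
            return state, "time", iterations
        state = refine(state)
        iterations += 1


def newton_sqrt2(x):
    # One Newton step toward sqrt(2); a stand-in for any refinable algorithm.
    return 0.5 * (x + 2.0 / x)


def neg_residual(x):
    # Higher is better: zero means x * x equals 2 exactly.
    return -abs(x * x - 2.0)


# Generous budget: the loop stops because the quality target is met.
result, reason, steps = run_anytime(newton_sqrt2, neg_residual, 1.0, -1e-12, 5.0)

# Zero budget: the loop stops on time and returns the unrefined initial guess.
cut_short, cut_reason, cut_steps = run_anytime(
    newton_sqrt2, neg_residual, 1.0, -1e-12, 0.0
)
print(result, reason, steps)
print(cut_short, cut_reason, cut_steps)
```

The caller always receives an answer together with the reason for stopping, so a downstream user can tell a converged result from one truncated by the deadline.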


Assumption 6 Context Errors can be fully Quantified 


Assumption 6: Error can be fully quantified 
Identifying error sources assumes that the system can be decomposed into its
functions and their components. Then, the component metrics can be combined
to match the system-level performance measures (Fig. 5.4, Error arrow). Error
analysis does not provide any information for decision relevance [79].


Fig. 5.8 Data structures for knowledge types time versus decision quality for computational
strategies (adapted from [78])


The problems with this assumption are that: (1) components are often tested
using their local or domain-specific metrics, and translation to global measures is
either impractical or has no cognitive basis; (2) metrics often relate to the
performance of an algorithm, called producer’s performance, rather than the amount
of evidence a user must review to make a decision, called user’s performance; and
(3) component-level errors are incorrectly assumed to be uncorrelated.
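
The effect of problem (3) can be illustrated with a small sketch. It assumes, purely for illustration, that the system error is the sum of the component errors and that all component pairs share one correlation coefficient; the function name `system_sigma` and the numerical values are hypothetical.

```python
import math


def system_sigma(sigmas, rho):
    """Standard deviation of a sum of component errors.

    Assumes an additive error model in which every pair of distinct
    components has the same correlation coefficient `rho`.
    """
    variance = 0.0
    for i, sigma_i in enumerate(sigmas):
        for j, sigma_j in enumerate(sigmas):
            corr = 1.0 if i == j else rho
            variance += corr * sigma_i * sigma_j
    return math.sqrt(variance)


sigmas = [0.1, 0.2, 0.2]                      # assumed component error levels
independent = system_sigma(sigmas, rho=0.0)   # errors treated as uncorrelated
correlated = system_sigma(sigmas, rho=0.5)    # same components, correlated
print(independent, correlated)
```

Under these assumptions, treating positively correlated errors as uncorrelated understates the system-level error, so the resulting system-level metric is optimistic.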

While the error analysis leads to incorrect threat analysis, we can assume that the
threat analysis is pessimistic (e.g., a lower bound). It is not that threat should not
be determined, but rather that the results (with the many assumptions) should err on
the side of caution. Measures of effectiveness [80] require that the many sources of
uncertainty be accounted for in the process. Currently, the International Society of
Information Fusion Evaluation and Testing of Uncertainty Reasoning Working Group
(ETURWG) [81] is investigating these issues for both context analysis and future
interoperable standards [82].


5.4 Context-Based Threat Example 


The following example shows how the earlier assumptions are accounted for
quantitatively. In the example, Bayes’ rule is used for data fusion and Dempster’s
rule is used for evidence accumulation. We seek to address the assumptions
(6) quantifiable, (5) actionable, (4) reproducible, (3) use of context data,
(2) acquirable, and (1) specific, for which we use evidence theory through
Proportional Conflict Redistribution (PCR).

Recently, Dezert [83] has shown that Dempster’s rule is consistent with probability
calculus and Bayesian reasoning if and only if the prior P(X) is uniform. However,
when P(X) is not uniform, Dempster’s rule gives a different result. Yen
[84] developed methods to account for nonuniform priors. Others have also tried to
compare Bayes and evidential reasoning (ER) methods [85]. Assuming that we
have multiple measurements $Z = \{Z_1, Z_2, \ldots, Z_N\}$ for cyber detection D being
monitored, Bayesian and ER methods are developed next.


5.4.1 Relating Bayes to Evidential Reasoning 


Using the derivation by Dezert [83], assuming conditional independence, one has 
the Bayes method: 


$$
P(X \mid Z_1 \cap Z_2) = \frac{P(X \mid Z_1)\, P(X \mid Z_2) / P(X)}{\sum_{i=1}^{N} P(X_i \mid Z_1)\, P(X_i \mid Z_2) / P(X_i)} \tag{5.1}
$$
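
As a minimal numerical sketch of Eq. (5.1), the fusion can be written for a discrete set of states. The function name `bayes_fuse` and the probabilities are assumptions chosen for illustration; they are not data from the chapter.

```python
def bayes_fuse(prior, post1, post2):
    """Fuse two posteriors P(X|Z1) and P(X|Z2) that share the prior P(X).

    Implements Eq. (5.1) under conditional independence of Z1 and Z2.
    Each argument is a list of probabilities over the same ordered states,
    and the prior must be strictly positive for every state.
    """
    unnormalized = [p1 * p2 / p0 for p0, p1, p2 in zip(prior, post1, post2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two states (e.g., threat and no threat) with assumed values.
prior = [0.3, 0.7]
post1 = [0.8, 0.2]
post2 = [0.6, 0.4]

fused = bayes_fuse(prior, post1, post2)

# A second source that carries no information returns the prior as its
# posterior, so the fused result reduces to the first posterior.
uninformative = bayes_fuse(prior, post1, prior)
print(fused, uninformative)
```

The second call checks the limiting behavior of the formula: an uninformative source leaves the other posterior unchanged.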


With no information from $Z_1$ or $Z_2$, $P(X \mid Z_1, Z_2) = P(X)$. Without $Z_2$,
$P(X \mid Z_1, Z_2) = P(X \mid Z_1)$, and without $Z_1$, $P(X \mid Z_1, Z_2) = P(X \mid Z_2)$.
Using Dezert’s formulation, when the prior is uniform the denominator can be
expressed as a normalization coefficient:


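$$
m_{12}(\emptyset) = 1 - \sum_{X_i, X_j \mid X_i \cap X_j \neq \emptyset} P(X_i \mid Z_1)\, P(X_j \mid Z_2) \tag{5.2}
$$

Here $m_{12}(\emptyset)$ is the total probability mass of the conflicting information.
Using this relation, the fused posterior is

$$
P(X \mid Z_1 \cap Z_2) = \frac{1}{1 - m_{12}(\emptyset)}\, P(X \mid Z_1)\, P(X \mid Z_2) \tag{5.3}
$$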


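which corresponds to Dempster’s rule of combination using Bayesian belief masses
with uniform priors. When the priors are not uniform, Dempster’s rule is not
consistent with Bayes’ rule. For example, let $m_0(X) = P(X)$, $m_1(X) = P(X \mid Z_1)$,
and $m_2(X) = P(X \mid Z_2)$; then

$$
m(X) = \frac{m_0(X)\, m_1(X)\, m_2(X)}{1 - m_{012}(\emptyset)} = \frac{P(X)\, P(X \mid Z_1)\, P(X \mid Z_2)}{\sum_{i=1}^{N} P(X_i)\, P(X_i \mid Z_1)\, P(X_i \mid Z_2)} \neq P(X \mid Z_1 \cap Z_2)
$$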


Thus, methods are needed to deal with nonuniform priors and appropriately 
redistribute the conflicting masses. 
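
The disagreement can be checked numerically with a short sketch. It assumes Bayesian belief masses (mass on single states only), for which Dempster’s rule reduces to a normalized product of the masses; the function names and probabilities are hypothetical.

```python
def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def bayes_fuse(prior, post1, post2):
    # Bayes fusion under conditional independence: P(X|Z1) P(X|Z2) / P(X).
    return normalize([p1 * p2 / p0 for p0, p1, p2 in zip(prior, post1, post2)])


def dempster_bayesian(*masses):
    # Dempster's rule for Bayesian masses: normalized product over singletons.
    products = [1.0] * len(masses[0])
    for mass in masses:
        products = [p * m for p, m in zip(products, mass)]
    return normalize(products)


post1 = [0.6, 0.4]   # assumed P(X | Z1)
post2 = [0.7, 0.3]   # assumed P(X | Z2)

# Uniform prior: combining the two posteriors with Dempster's rule, with or
# without the prior as a third source, matches the Bayes result.
uniform = [0.5, 0.5]
bayes_uniform = bayes_fuse(uniform, post1, post2)
ds_uniform = dempster_bayesian(uniform, post1, post2)

# Nonuniform prior: treating the prior as a third source no longer matches.
skewed = [0.2, 0.8]
bayes_skewed = bayes_fuse(skewed, post1, post2)
ds_skewed = dempster_bayesian(skewed, post1, post2)
print(bayes_uniform, ds_uniform)
print(bayes_skewed, ds_skewed)
```

With these assumed numbers the two rules agree for the uniform prior and differ substantially for the skewed prior, because Bayes fusion divides by the prior while Dempster’s rule multiplies by it.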
5.4.2 Proportional Conflict Redistribution


Recent advances in DS methods include Dezert-Smarandache Theory (DSmT).
DSmT is an extension to the Dempster-Shafer method of ER, which has been
detailed in numerous papers and texts [86]. The methods for reasoning were
introduced in [87], and the hyper-power-set notation for DSmT was presented in
[88]. Recent applications include the DSmT Proportional Conflict Redistribution
rule 5 (PCR5) applied to target tracking [89].

The key contributions of DSmT are the redistributions of masses such that no
refinement of the frame Θ is possible unless a series of constraints is known. For
example, Shafer’s model [90] is the most constrained DSm hybrid model in DSmT.
Since Shafer’s model, authors have continued to refine the method to more
precisely address the combination of conflicting beliefs [91] and generalization of
the combination rules [92, 93]. An adaptive combination rule [94] and rules for
quantitative and qualitative combinations [95] have been proposed. Recent
examples for sensor applications include electronic support measures [96],
physiological monitoring sensors [97], and seismic-acoustic sensing [98].

Here we use the Proportional Conflict Redistribution rule no. 5 (PCR5). We
replace Smets’ rule [99] with the more effective PCR5 and apply it to cyber
detection probabilities. All details and justifications, with examples of PCRn
fusion rules and DSm transformations, can be found in the DSmT compiled texts
[86]. A comparison of the methods is shown in Fig. 5.9.
Fig. 5.9 Comparison of Bayesian, Dempster-Shafer, and PCR5 fusion theories




In the DSmT framework, PCR5 is generally used to combine the basic belief
assignments (BBAs). PCR5 transfers the conflicting mass only to the elements
involved in the conflict and proportionally to their individual masses, so that the
specificity of the information is entirely preserved in this fusion process. Let
$m_1(\cdot)$ and $m_2(\cdot)$ be two independent BBAs; then the PCR5 rule is defined as
follows (see [86] for full justification and examples): $m_{PCR5}(\emptyset) = 0$ and,
$\forall X \in 2^{\Theta} \setminus \{\emptyset\}$, where $\emptyset$ is the null set and $2^{\Theta}$ is the power set:


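$$
m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^{\Theta} \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^{\Theta} \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right]
$$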


where ∩ is the intersection and all denominators in the equation above are different
from zero. If a denominator is zero, that fraction is discarded. Additional properties
and extensions of PCR5 for combining qualitative BBAs can be found in [86] with
examples and results. All propositions/sets are in a canonical form.
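
A compact sketch of the two-source PCR5 rule is given below for illustration. It represents each BBA as a dictionary from focal elements (frozensets of hypotheses) to masses; the function name `pcr5`, the frame, and the mass values are assumptions made for this sketch and are not taken from [86].

```python
from itertools import product


def pcr5(m1, m2):
    """Combine two BBAs with the two-source PCR5 rule.

    Each BBA maps a focal element (a frozenset of hypotheses) to its mass.
    """
    fused = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        joint = w1 * w2
        if joint == 0.0:
            continue  # zero-mass pairs contribute nothing (no zero denominator)
        common = x1 & x2
        if common:
            # Agreement: the product goes to the intersection.
            fused[common] = fused.get(common, 0.0) + joint
        else:
            # Conflict: return the product to the two elements involved,
            # proportionally to the masses that produced it.
            fused[x1] = fused.get(x1, 0.0) + w1 * joint / (w1 + w2)
            fused[x2] = fused.get(x2, 0.0) + w2 * joint / (w1 + w2)
    return fused


T = frozenset({"threat"})
N = frozenset({"no threat"})

m1 = {T: 0.6, N: 0.4}   # assumed BBA from source 1
m2 = {T: 0.2, N: 0.8}   # assumed BBA from source 2

fused = pcr5(m1, m2)
print(fused[T], fused[N])
```

Because each conflicting product is split only between the two elements that generated it, the fused masses still sum to one without the global renormalization used by Dempster’s rule.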


5.4.3 Threat Assessment from Context 


In this example, we assume that policies of threat analysis are accepted and that the
trust assessment must determine whether the dynamic data is trustworthy,
threatening, or under attack (Assumption 6: quantifiable). The application system
collects raw measurements on the data situation, such as Boston Bomber activities as
an attack (Assumption 2: acquirable). Situation awareness is needed to determine
the importance of the information for societal safety (Assumption 1: specific). With
a priori knowledge, data exploitation can be used to determine the situation
(Assumption 3: use of context data). The collection and processing should be
consistent for decision making (Assumption 4: reproducible) over the data
acquisition timeline. Finally, the focus of the example is to increase the timeliness
of the machine fusion result for human decision making (Assumption 5: actionable).
Conventional information fusion processing would include Bayesian analysis to
determine the state of the attack. However, here we use the PCR5 rule, which
distributes the conflicting information over the partial states. Figure 5.10 shows the
results for a societal status undergoing changes in the social order, such as events
indicating an attack, and the different methods (Bayes, DS, and PCR5) used to assess
the threat. An important result is the timeliness of the change in situation state as
depicted. In the example, there is an initial shock of information that lasts for a brief
time (time 20-27 s) while the situation is being assessed (threat or no threat),
followed by another repeated event (time 40-50 s). As shown, the change in state is
not recorded by Bayes, but PCR5 denotes the change. After the initial attacks,
the threat state is revealed (time 70-100 s), from which a Bayesian method starts to
indicate a change in the threat state.

Here it is important to note that context is used in PCR5, as the knowledge of
the first event leads to a contextual change (that is not detected by using Bayes’
rule). Likewise, the possibility for a state change (unknown unknown) is determined
from the conflicting data. The conflict used in the example is 20 %, which
represents a case where some intelligence agencies are reporting the facts (threat
event), while others are reporting differently since they cannot confirm the
evidence. The notional example is only shown to highlight the importance of
context. Two cases arise: (1) whether the data is directly accessible, which leads to
conflict in reporting, and (2) exhaustively modeling all contextual data to be precise
is limited, leading to some failures.
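
The difference in responsiveness can be reproduced with a toy sketch. It is not the simulation behind Figs. 5.10 and 5.11: the two-state frame, the measurement masses of 0.2 and 0.8, the sequence length, and the decision level of 0.6 are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def bayes_step(belief, measurement):
    """Recursive Bayes update of the belief in 'threat' for two states."""
    numerator = belief * measurement
    return numerator / (numerator + (1.0 - belief) * (1.0 - measurement))


def pcr5_step(belief, measurement):
    """PCR5 fusion of two Bayesian BBAs on {threat, no threat}.

    Returns the fused mass on 'threat'; the rest goes to 'no threat'.
    """
    a, b = belief, measurement
    threat = a * b  # both sources agree on 'threat'
    if a + (1.0 - b) > 0.0:
        threat += a * a * (1.0 - b) / (a + (1.0 - b))  # conflict share from a
    if b + (1.0 - a) > 0.0:
        threat += b * b * (1.0 - a) / (b + (1.0 - a))  # conflict share from b
    return threat


def first_crossing(step_fn, measurements, level, start=0.5):
    """Return the 1-based step at which the belief first exceeds `level`."""
    belief = start
    for step, z in enumerate(measurements, start=1):
        belief = step_fn(belief, z)
        if belief > level:
            return step
    return None


# Ten quiet reports (mass 0.2 on threat), then fifteen threat reports (0.8).
measurements = [0.2] * 10 + [0.8] * 15

bayes_cross = first_crossing(bayes_step, measurements, level=0.6)
pcr5_cross = first_crossing(pcr5_step, measurements, level=0.6)
print(bayes_cross, pcr5_cross)
```

In this toy setting the recursive Bayes estimate must first undo the evidence accumulated during the quiet period, whereas PCR5 keeps part of the conflicting mass on the threat hypothesis and therefore crosses the decision level after far fewer threat reports.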

Trust is then determined from the percent improvement in analysis for the state
change. Since the classification of attack versus no attack is not consistent, there is
some conflict in the processing of the measurement data when going from
measurements of no attack to measurements of attack and vice versa. The constant
changing of measurements requires acknowledgment of the change. The initial
conflict in the reported evidence requires that the data conflict be measured, from
which the PCR5 method better characterizes the information, leading to improved
trust in the fusion result.

The improvement of PCR5 over Bayes is shown in Fig. 5.11 and compared with
the modest improvement from DS. The average performance improvement of PCR5
is 50 % and that of DS is 1 %, which is data, context, and application dependent.
When comparing the results, it can be seen that when a system goes from a normal
to an attack state, PCR5 responds more quickly in analyzing the attack, which
maintains trust in the decision. Such issues of data reliability, statistical credibility,
and application survivability all contribute to the presentation of information to an
application-based user. While the analysis is based on behavioral situation
awareness, it is important to leverage context, but also to be aware when the
contextual factors are not complete, hence conflict.


5.5 Discussion 


The chapter explicitly identified the common assumptions incorporated into com- 
putational decision engines. The assumptions at each functional block propagate 
through the system and dramatically affect the utility of their output. In the case of 
threat assessment, these assumptions could lead to intelligence failures. Context is
important, but not completely measurable in a timely manner. By understanding
these assumptions, system users can mitigate these pitfalls by employing skepticism
and seeking confirmation of the results. The notional example provided a method
for detecting a change in the threat state that would aid in emergency response.


5.6 Conclusions 


We outlined the analysis of threat assessment given the context of the situation. 
Threat analysis needs were juxtaposed against the assumptions developers use to 
make the computational decision support system tractable. We showed that the 
long-term system goals have some very real near-term realities. We organized the 
contextual information for threat analysis by categorizing six assumptions: 
(1) specific problem, (2) acquirable data, (3) use of context, (4) reproducible 
analysis, (5) actionable intelligence, and (6) quantifiable decision making. Finally,
a notional example was presented to highlight the need for evidence theory (e.g.,
PCR) to deal with conflicting information in building a context assessment.

We hope that we enlighten users of tools to question the accuracy and relevance
of the computer generated analysis. Likewise, we hope that developers better
understand the users’ needs for these tools in an operational environment. Context
for threat assessment must be discernible by both the machine and the user.